Will the Home Team Win? On the Road to 1.5 Billion Tweets and Six Thousand Baseball Games Providing Insight!
Researchers operate with limited budgets and inadequate resources. This prohibits big data research and suppresses the innovation needed to direct inquiry and construct robust research-based information systems. Such issues are not insuperable; for example, this project is initialized with limited resources and attempts to build theory, describe architecture, and set the vision for future work. This first “On the Road to …” paper tenders a methodology that examines the use of social media variables as a proxy for human emotion and epistemic activity. A social media corpus is processed, and a regression model considers MLB team wins as the dependent variable and a social media tweet corpus, operationalized via NLP, as the independent variable. Results are presented. Future work describes a predictive GIS artifact that will input, process, and visualize a spatial, time-based, NLP-processed social media corpus and its integration with geospatial indexing.
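The regression framing in this abstract (team wins regressed on an NLP-derived tweet variable) can be sketched with a closed-form simple least-squares fit. The feature and outcome values below are illustrative placeholders, not data from the study.

```python
# Minimal sketch of regressing a win outcome on an NLP-derived tweet feature.
# All data values are hypothetical, for illustration only.

def ols_fit(x, y):
    """Return (intercept, slope) for a simple linear regression of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sxy / sxx
    return my - slope * mx, slope

# Hypothetical aggregates: NLP-derived positive-tweet share per window (x)
# against the team's win rate over the same window (y).
tweet_positivity = [0.42, 0.55, 0.61, 0.48, 0.70, 0.66]
win_rate = [0.40, 0.52, 0.58, 0.45, 0.69, 0.60]

intercept, slope = ols_fit(tweet_positivity, win_rate)
```

A positive fitted slope would indicate the tweet-derived proxy moves with the win outcome, which is the kind of association the regression model is set up to test.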
Social Media Operationalized for GIS: The Prequel
With social media a de facto global communication channel used to disseminate news, entertainment, and one’s self-revelations, the latter contain double-talk, peculiar insight, and contextual observation about real-world events. The primary objective is to propose a novel pipeline that classifies a tweet as either “useful” or “not useful” using widely accepted Natural Language Processing (NLP) techniques, and to measure the effect of such a method on the performance of a Geographical Information System (GIS) artifact. A 1,000-tweet sample is manually tagged and compared to an innovative social media grammar applied by a rule-based social media NLP pipeline. Evaluation underpins the question: prior to content analysis of a tweet, does a method exist to identify a tweet as “useful” for subsequent processing? Indeed, “useful” tweet identification via NLP returned precision of 0.9256, recall of 0.6590, and an F-measure of 0.7699; consequently, GIS social media processing increased 0.2194 over baseline.
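The reported F-measure follows from the reported precision and recall via the standard harmonic-mean formula; the short check below reproduces it from the abstract's own numbers.

```python
# Reproduce the reported F-measure from the reported precision and recall.

def f_measure(precision, recall):
    """Harmonic mean of precision and recall (F1)."""
    return 2 * precision * recall / (precision + recall)

p, r = 0.9256, 0.6590  # values stated in the abstract
f1 = f_measure(p, r)   # rounds to the reported 0.7699
```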
GIS Investigation of Crime Prediction with an Operationalized Tweet Corpus
Social media, as the de facto communication channel, is used to disseminate one’s diurnal self-revelations. These disclosures often contain double-talk, peculiar insights, or contextual information about real-world events. Natural language processing is regularly used to uncover both obvious and latent knowledge claims within disclosures published amid this complex environment. For example, a perpetrator with first-hand knowledge of a criminal incident may use social media to post critical information about it. A geographic information system (GIS) is capable of large-scale point data analysis and possesses methods that enable dataset processing, evaluation, and automatic spatial visualization. Such an artifact, fused with traditional environmental criminology theory and social media, erects guidelines, tools, and models for substantive construction and evaluation of GIS crime analysis solutions. Provided the social media stream is timely and correctly processed, corrective action can be taken. The construction of a natural language processing social media annotation pipeline identifies latent indicators extracted from a social media corpus and is an integral part of societal mishap prediction. Spatial visualizations and regression analyses were used to describe and evaluate project artifacts. As a result, a social media corpus was operationalized and subsequently used as a proxy for a traditional environmental criminology risk layer in the construction of a social media GIS crime analysis artifact. Using such multi-domain collaboration, the artifact improved the predicted crime incident outcome with an overall R-squared increase of 21.94%. To our knowledge, there are no prior results to compare against.
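The evaluation logic described here (a baseline model versus one augmented with a tweet-derived risk layer, compared via R-squared) can be sketched as below. All counts and predictions are illustrative placeholders, not the study's data.

```python
# Sketch of comparing model fit with and without a social-media-derived
# risk layer, using the coefficient of determination (R-squared).
# All values below are hypothetical, for illustration only.

def r_squared(y, y_hat):
    """Coefficient of determination for predictions y_hat against y."""
    my = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, y_hat))
    ss_tot = sum((a - my) ** 2 for a in y)
    return 1 - ss_res / ss_tot

crime_counts = [12, 15, 9, 20, 14, 18]
baseline_pred = [11, 13, 12, 17, 15, 16]    # e.g. criminology layers only
augmented_pred = [12, 14, 10, 19, 14, 17]   # plus operationalized tweet layer

delta = r_squared(crime_counts, augmented_pred) - r_squared(crime_counts, baseline_pred)
```

A positive `delta` corresponds to the kind of R-squared improvement the abstract reports for the augmented artifact.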
Leveraging Compositional Methods for Modeling and Verification of an Autonomous Taxi System
We apply a compositional formal modeling and verification method to an
autonomous aircraft taxi system. We provide insights into the modeling approach
and we identify several research areas where further development is needed.
Specifically, we identify the following needs: (1) semantics of composition of
viewpoints expressed in different specification languages, and tools to reason
about heterogeneous declarative models; (2) libraries of formal models for
autonomous systems to speed up modeling and enable efficient reasoning; (3)
methods to lift verification results generated by automated reasoning tools to
the specification level; (4) probabilistic contract frameworks to reason about
imperfect implementations; (5) standard high-level functional architectures for
autonomous systems; and (6) a theory of higher-order contracts. We believe that
addressing these research needs, among others, could improve the adoption of
formal methods in the design of autonomous systems including learning-enabled
systems, and increase confidence in their safe operations.Comment: 2023 International Conference on Assured Autonomy (ICAA
Big Social Data and GIS: Visualize Predictive Crime
Social media is a desirable big data source used to examine the relationship between crime and social behavior. Observation of this connection is enriched within a geographic information system (GIS) rooted in environmental criminology theory, which produces several different results to substantiate such a claim. This paper presents the construction and implementation of a GIS artifact producing visualization and statistical outcomes to develop evidence that supports predictive crime analysis. An information system research prototype guides inquiry and uses crime as the dependent variable and a social media tweet corpus, operationalized via natural language processing, as the independent variable. Recognizing social media as a predictive crime variable is prudent; researchers and practitioners will better appreciate its capability. Inclusive visual and statistical results are novel, represent state-of-the-art predictive analysis, increase the baseline R2 value by 7.26%, and support future predictive crime-based research when front-run with real-time social media.
Toward Predictive Crime Analysis via Social Media, Big Data, and GIS Spatial Correlation
To support a dissertation proposal, a link between social media, incident-based crime data, and public domain data needed to be verified. A predictive crime-based artifact utilizing data mining and natural language processing techniques, commingled with geographic information system architecture, is complex. With respect to social media, an attempt was made to observe such an artifact’s data flexibility, process control, and predictive capabilities. Data and their capabilities were observed when preprocessing social media’s noisy data, government-based structured data, and obscurely collected field data for use in a predictive GIS artifact. To support project goals, the approach for artifact design, data collection, and discussion of results was couched as an exploratory study. Results indicate a link between social media data and domain-specific datasets exists. Questions for further observation and research deal with processing the subtle differences between structured and noisy data, weighted social media input layers, and time-series analysis.
BetaZero: Belief-State Planning for Long-Horizon POMDPs using Learned Approximations
Real-world planning problems, including autonomous driving and
sustainable energy applications like carbon storage and resource
exploration, have recently been modeled as partially observable
Markov decision processes (POMDPs) and solved using approximate methods. To
solve high-dimensional POMDPs in practice, state-of-the-art methods use online
planning with problem-specific heuristics to reduce planning horizons and make
the problems tractable. Algorithms that learn approximations to replace
heuristics have recently found success in large-scale problems in the fully
observable domain. The key insight is the combination of online Monte Carlo
tree search with offline neural network approximations of the optimal policy
and value function. In this work, we bring this insight to partially observed
domains and propose BetaZero, a belief-state planning algorithm for POMDPs.
BetaZero learns offline approximations based on accurate belief models to
enable online decision making in long-horizon problems. We address several
challenges inherent in large-scale partially observable domains; namely
challenges of transitioning in stochastic environments, prioritizing action
branching with limited search budget, and representing beliefs as input to the
network. We apply BetaZero to various well-established benchmark POMDPs found
in the literature. As a real-world case study, we test BetaZero on the
high-dimensional geological problem of critical mineral exploration.
Experiments show that BetaZero outperforms state-of-the-art POMDP solvers on a
variety of tasks.
Comment: 20 pages
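The core operation behind the belief-state planning described above is the Bayesian belief update, b'(s) proportional to P(o|s)·b(s). The two-state deposit/no-deposit toy and its observation probabilities below are illustrative, not from the BetaZero experiments.

```python
# Minimal discrete Bayes-filter belief update, the basic step behind
# belief-state planning in POMDPs. States, observations, and probabilities
# here are hypothetical.

def belief_update(belief, obs, obs_model):
    """Posterior over states given an observation: b'(s) ~ P(o|s) * b(s)."""
    unnorm = {s: obs_model[s][obs] * p for s, p in belief.items()}
    z = sum(unnorm.values())
    return {s: p / z for s, p in unnorm.items()}

# Toy mineral-exploration framing: is a deposit present at a site?
obs_model = {
    "deposit":    {"positive_survey": 0.8, "negative_survey": 0.2},
    "no_deposit": {"positive_survey": 0.3, "negative_survey": 0.7},
}
belief = {"deposit": 0.5, "no_deposit": 0.5}
belief = belief_update(belief, "positive_survey", obs_model)
```

In a full planner, updated beliefs like this one would be summarized (e.g., by their statistics) and fed as input to the learned policy and value networks guiding the tree search.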